Abstract

We propose a collision recovery scheme for symbol-synchronous slotted ALOHA (SA) based on
physical layer network coding over extended Galois Fields. Information is extracted from colliding
bursts, which yields a higher maximum throughput than previously proposed collision
recovery schemes. An energy analysis is also performed, and it is shown that, by adjusting the
transmission probability, high energy efficiency can be achieved. The paper also addresses several practical
aspects, namely frequency, phase, and amplitude estimation, as well as partial symbol asynchronism. A
performance evaluation carried out using the proposed algorithms shows high
normalized throughput.
Index Terms

Physical-layer network coding; slotted ALOHA; random access networks; collision recovery; 
satellite communications; CRDSA; energy efficient multiple access; spectral efficient multiple 
access. 

I. Introduction 

The throughput of Slotted ALOHA (SA) systems is limited by the collisions that take place when
more than one node accesses the channel in the same time slot. This limitation is particularly problematic
in satellite networks with random access, where the long round-trip time (RTT) greatly limits feedback
from the receiver, for example to perform load control or to request retransmissions. Techniques such as
Diversity Slotted ALOHA (DSA) [1], in which each packet is transmitted more than once, have been
proposed in order to increase the probability of successful detection. The spectral efficiency of SA
systems can be increased by exploiting the collided signals. In Contention Resolution Diversity Slotted
ALOHA (CRDSA) [2], the collided signals are exploited using an iterative interference cancellation (IC)
process. In CRDSA each packet is transmitted more than once, and uncollided packets are subtracted
from the slots in which their replicas are present. In [3] a packet-level forward error correction (FEC) code
was applied to CRDSA, while in [4] a convergence analysis and optimization of CRDSA were
proposed.

Another technique that allows information to be extracted from colliding signals is physical layer network
coding (PHY NC). PHY NC was originally proposed to increase spectral efficiency in two-way relay
communication [5] by having the relay decode the collision of two signals under the hypothesis of
symbol, frequency and phase synchronism. Several studies have been reported in the literature about
synchronization issues, gain analysis and ad-hoc modulation techniques for PHY NC in the case of
two colliding signals [6], [7], [8]. In [9] PHY NC was applied in the satellite context for pairwise
node communication. In [10] and [11] it was proposed to apply PHY NC to determine the identity
of transmitting nodes in case of ACK collision in multicast networks by using energy detection and
ad-hoc coding schemes, under the hypothesis of phase-synchronous signal superposition at the receiver.
In [12] the decoding of multiple colliding signals over generally complex channels was studied
from an information theoretical point of view. In [13] PHY NC was applied to collision resolution
in ALOHA systems with feedback from the receiver, under the assumption of frequency-synchronous
transmitters.

In this paper we present a new scheme, named Network-Coded Diversity Protocol (NCDP), that
leverages PHY NC in extended Galois Fields to recover collisions in symbol-synchronous
SA systems. Once PHY NC has been applied to decode the collided bursts, the receiver uses common
matrix manipulation techniques over finite fields to recover the original messages, which results in a
high-throughput scheme. The proposed scheme and analysis differ from previous works on collision
resolution at both the system (SYS) level and the physical (PHY) level:

SYS: • Unlike in [13], we assume that transmissions are organized in frames. We consider two
different setups. In the first, the nodes do not receive any feedback from the receiver. On the one
hand, the absence of feedback leads to a best-effort scheme, in which there is no guarantee
that a message is received; on the other hand, it notably simplifies the system architecture
and decreases the total amount of energy spent per received packet. In the second setup
feedback from the receiver is allowed. In particular, we consider
an automatic repeat request (ARQ) scheme, in which a node receives an acknowledgement
(ACK) or a negative acknowledgement (NACK) from the receiver when a message is or
is not correctly received, respectively. A message for which a NACK has been received is
retransmitted in a different frame. The retransmission process goes on until the message is
acknowledged.

• We jointly evaluate the spectral efficiency (average number of messages successfully
received per slot) and the energy consumption (average amount of energy needed for a
message to be correctly received) of the proposed scheme and compare them with those of other collision
resolution schemes previously proposed in the literature.

PHY: • We use extended Galois Fields, i.e., GF($2^n$) with $n \geq 2$, instead of GF(2), which is
generally used in PHY NC. This allows the diversity of the system to be better exploited, leading
to increased spectral efficiency and, depending on the system load, to increased energy
efficiency.

• We take into account frequency and phase offsets at the transmitters when applying PHY
NC for an arbitrary number of colliding signals. To the best of our knowledge, the issue of frequency
offsets in PHY NC has previously been addressed only for the case of two colliding signals
(see, e.g., [14], [15] and references therein).

• We show the feasibility of channel estimation for PHY NC in the presence of more than two
colliding signals, unlike previous works where only two colliding signals were considered
(see, e.g., [16]).

• We study the effect of imperfect symbol synchronism on the decoder FER for an arbitrary
number of colliding signals and propose four different methods to compensate for such
an effect.

The rest of the paper is organized as follows. In Section II we present the system model. Section
III describes how channel decoding works for a generic number of colliding signals with
independent frequency and phase offsets. In Section IV the proposed scheme is described, while its
performance is studied in Section V in terms of both spectral and energy efficiency. Section VI deals
with issues such as channel estimation and error detection, which are fundamental for a practical
implementation of the proposed scheme. Section VII is dedicated to the effect of imperfect symbol
synchronization on the decoder performance in case of multiple colliding signals, and presents different schemes
to overcome such effects. In Section VIII we present the numerical results, while Section
IX contains the conclusions.



II. System Model

Let us consider the return link (i.e., the link from a user terminal to the satellite/base station)
of a multiple access system with $M$ transmitting terminals, $T_1, \ldots, T_M$, and one receiver $R$. Packet
arrivals at each transmitter are modeled as a Poisson process, independent from
one transmitter to the other. Each packet $\mathbf{u}_i = [u_i(1), \ldots, u_i(K)]$ consists of $K$ binary symbols of
information $u_i(\ell) \in \{0, 1\}$, for $\ell = 1, \ldots, K$. We assume that, upon receiving a message, each terminal
$T_i$ uses the same linear channel code of fixed rate $r = K/N$ to protect its message $\mathbf{u}_i$, obtaining the
codeword $\mathbf{x}_i = [x_i(1), \ldots, x_i(N)]$, where $x_i(l) \in \{0, 1\}$ for $l = 1, \ldots, N$. For ease of exposition a
BPSK modulation is considered. Each codeword $\mathbf{x}_i$ is BPSK modulated (using the mapping $0 \to -1$,
$1 \to +1$), thus obtaining the transmitted signal

$$s_i(t) = \sum_{l=1}^{N} b_i(l)\, g(t - l T_s), \qquad (1)$$

where $T_s$ is the symbol period, $b_i(l)$ is the BPSK mapping of $x_i(l)$ and $g(t)$ is the square root raised
cosine (SRRC) pulse. The signal $s_i(t)$ is called a burst.

In the following we will refer to a time division multiple access (TDMA) scheme. However, the
proposed techniques can also be applied to other access schemes, such as
multi-frequency-TDMA (MF-TDMA), in which a frame may include several carriers, or code division multiple
access (CDMA), where NCDP can be used to recover collisions in each of the code sub-channels.
It should be noted that the proposed technique still relies on single carrier transmission by each
user terminal. From the user terminal perspective no significant change is required. Transmissions are
organized in frames. Each frame is divided into $S$ time slots. The number $S$ of time slots that compose
a frame is constant, i.e., it does not change from one frame to the other. The duration of each slot is
equal to about $N$ burst symbols. When more than one burst is transmitted in the same slot a collision
occurs at the receiver. A collision involving $k$ transmitters is said to have size $k$. We assume
symbol-synchronous transmissions, i.e., in case of a collision, the signals from the transmitters add up with
symbol synchronism at $R$. The received signal before matched filtering and sampling at $R$, in case of
a collision of size $k$ (assuming, without loss of generality, that the first $k$ terminals collide), is:

$$y(t) = h_1(t) s_1(t) + \ldots + h_k(t) s_k(t) + w(t), \qquad (2)$$
where $s_i(t)$ is the burst transmitted by user $i$, $w(t)$ is a complex additive white Gaussian noise (AWGN)
process, and $h_i(t)$ takes into account the channel from terminal $i$ to the receiver. $h_i(t)$ can be expressed
as:

$$h_i(t) = A_i e^{j(2\pi \Delta\nu_i t + \varphi_i)}, \qquad (3)$$

where $A_i = |h_i|$ is a lognormally distributed random variable modeling the channel amplitude of
transmitter $i$, while $\Delta\nu_i$ and $\varphi_i$ are the frequency and phase offsets with respect to the local oscillator
in $R$, respectively. We assume that the amplitude $A_i$ and the frequency offset $\Delta\nu_i$ remain constant
within one frame [2], while $\varphi_i$ is a random variable uniformly distributed in $[-\pi, +\pi]$ that changes
independently from one slot to the other. The fact that $\varphi_i$ changes from one slot to the other is due to
the phase noise at the transmitting terminals [2]. Assuming that the frequency offset is small compared
to the symbol rate $1/T_s$ ($\Delta\nu\, T_s \ll 1$), the sample taken at time $t_l$ after matched filtering of signal $y(t)$
is:

$$r(t_l) = h_1(t_l) q_1(t_l) + \ldots + h_k(t_l) q_k(t_l) + n(t_l), \qquad (4)$$

where $q_i(t) = s_i(t) \otimes g(-t)$, while the $n(t_l)$'s are i.i.d. zero-mean complex Gaussian random variables with
variance $N_0$ in each component. Note that even when a BPSK modulation is used, as we are assuming
in this paper, both the I and Q components of the received signal are considered by the receiver. This is
because the phases of the users have random relative offsets and thus both components carry information
relative to the useful signal. The random relative offsets must be taken into account by the decoder, as
they cannot be eliminated by the demodulator. We consider this in more detail in Section III.
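As an illustration of the sampled model in (4), the following minimal Python sketch generates the symbol-rate samples of a collision of size $k$. It rests on assumptions that are not stated in the text: an ideal Nyquist pulse, so that $q_i(t_l)$ reduces to the BPSK symbol $b_i(l)$, sampling instants $t_l = l T_s$, and frequency offsets expressed as the normalized product $\Delta\nu_i T_s$. The function name `collision_samples` and its arguments are hypothetical.

```python
import cmath
import math
import random


def collision_samples(bits, amps, norm_freq_offsets, phases, n0, rng):
    """Symbol-rate samples of k colliding BPSK bursts (sketch of eq. (4)).

    bits[i][l-1] in {0, 1} is the l-th coded bit of user i,
    amps[i] is A_i, norm_freq_offsets[i] is delta_nu_i * T_s,
    phases[i] is phi_i, n0 is the noise variance per real component.
    Assumption: ideal Nyquist pulse, so q_i(t_l) = b_i(l).
    """
    k = len(bits)
    n_sym = len(bits[0])
    sigma = math.sqrt(n0)
    samples = []
    for l in range(1, n_sym + 1):
        # Complex AWGN with variance n0 in each component.
        r = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        for i in range(k):
            b = 2 * bits[i][l - 1] - 1  # BPSK mapping 0 -> -1, 1 -> +1
            angle = 2 * math.pi * norm_freq_offsets[i] * l + phases[i]
            r += amps[i] * cmath.exp(1j * angle) * b
        samples.append(r)
    return samples


if __name__ == "__main__":
    rng = random.Random(1)
    r = collision_samples([[0, 1, 1], [1, 1, 0]], [1.0, 0.8],
                          [0.001, 0.004], [0.3, -1.2], 0.05, rng)
    print(r)
```

Because the relative phases are random, the samples are complex even though each burst is BPSK modulated, which is why both the I and Q components are kept.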

We assume that the receiver has knowledge of the nodes that are transmitting, as well as
full channel state information, at each time slot. As we are considering a random access scheme,
knowledge of the node identities cannot be available a priori at the receiver. Instead, the node identities
must be determined by $R$ from the received signal, even when a collision occurs. This can
be achieved by having the transmitting nodes add an orthogonal preamble to each transmitted burst,
assuming that the probability that two nodes use the same preamble is negligible [2]. We discuss the
issue of node identification and channel estimation in more detail in Section VI.

III. Multi-User Physical Layer Network Coding 

In this section we describe the way the received signal is processed by the receiver R in case of a 
collision. 

When a collision of size $k$ occurs, i.e., $k$ bursts collide in the same slot, the receiver tries to decode
the bit-wise XOR of the $k$ transmitted messages. This can be done by feeding the decoder with the
log-likelihood ratios (LLRs) for the received signal. The calculation of the LLRs for a collision of generic
size $k$ in case of BPSK modulation was presented in [13]. In the following we include the effect of the
frequency offset in the calculation of the LLRs, which was not taken into account in [13].

When signals from $k$ transmitters collide, the received signal at $R$ is given by (4). Each codeword $\mathbf{x}_i$
is calculated from $\mathbf{u}_i$ as $\mathbf{x}_i = C(\mathbf{u}_i)$, where $C(\cdot)$ is the channel encoder operator. All nodes use the same
linear code $C(\cdot)$. Starting from $r(t_l)$, the receiver $R$ wants to decode the codeword $\mathbf{x}_s = \mathbf{x}_1 \oplus \mathbf{x}_2 \oplus \ldots \oplus \mathbf{x}_k$,
where $\oplus$ denotes the bit-wise XOR. In order to do this the decoder of $R$ is fed with the vector $\mathbf{L}^{\oplus}$ of LLRs, whose $l$-th element is

$$L^{\oplus}(l) = \ln \frac{\sum_{i=1}^{\lceil k/2 \rceil} \sum_{m=1}^{\binom{k}{2i-1}} \exp\left(-\frac{|r(t_l) - \mathbf{h}(t_l)^T \mathbf{d}^o(2i-1, m)|^2}{2 N_0}\right)}{\sum_{i=0}^{\lfloor k/2 \rfloor} \sum_{m=1}^{\binom{k}{2i}} \exp\left(-\frac{|r(t_l) - \mathbf{h}(t_l)^T \mathbf{d}^e(2i, m)|^2}{2 N_0}\right)}, \qquad (5)$$

$\mathbf{h}(t_l)$ being a column vector containing the channel coefficients of the $k$ transmitters at time $t_l$ (which
change at each sample due to frequency offsets), while $\mathbf{d}^o(2i-1, m)$ and $\mathbf{d}^e(2i, m)$ are column vectors
containing one (the $m$-th) of the $\binom{k}{2i-1}$ or $\binom{k}{2i}$ possible permutations over $k$ symbols (without repetitions)
of an odd or even number of symbols with value $+1$, respectively. Equation (5) is derived considering
that an even or an odd number of symbols with value $+1$ adding up at $R$ must be interpreted by the
decoder as a 0 or a 1, respectively. The derivation of $L^{\oplus}(l)$ is detailed in the Appendix (see [6] and
[8] for an extension to higher order modulations). If the decoding process is successful, $R$ obtains the
message $\mathbf{u}_s = \mathbf{u}_1 \oplus \ldots \oplus \mathbf{u}_k$. In Section VI the FER curves for different collision sizes obtained using
these LLR values are shown.
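The LLR of the XOR can be sketched numerically by enumerating all $2^k$ BPSK symbol patterns and grouping them by the parity of the number of $+1$ symbols. The Python sketch below is only illustrative. It assumes the sign convention LLR $= \ln[P(\text{XOR} = 1)/P(\text{XOR} = 0)]$, which is our choice here, and a noise variance $N_0$ per real component. The function names `xor_llr` and `_logsumexp` are hypothetical.

```python
import itertools
import math


def _logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def xor_llr(r, h, n0):
    """LLR of the XOR of k BPSK-mapped bits from one received sample.

    r  : complex received sample r(t_l)
    h  : list of the k complex channel coefficients at time t_l
    n0 : noise variance per real component
    Assumed convention: ln P(xor = 1 | r) - ln P(xor = 0 | r).
    """
    odd, even = [], []
    for pattern in itertools.product((-1, 1), repeat=len(h)):
        mean = sum(hi * di for hi, di in zip(h, pattern))
        exponent = -abs(r - mean) ** 2 / (2 * n0)
        # An odd number of +1 symbols corresponds to XOR = 1.
        if pattern.count(1) % 2 == 1:
            odd.append(exponent)
        else:
            even.append(exponent)
    return _logsumexp(odd) - _logsumexp(even)


if __name__ == "__main__":
    h = [1.0 + 0.0j, 0.0 + 1.0j]
    print(xor_llr(1 - 1j, h, 0.1))  # symbols (+1, -1): XOR = 1
    print(xor_llr(1 + 1j, h, 0.1))  # symbols (+1, +1): XOR = 0
```

The enumeration costs $2^k$ terms per sample, so this direct form is practical only for moderate collision sizes.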



IV. Network-Coded Diversity Protocol

In this section we present our network-coded diversity protocol (NCDP), which aims at increasing
the throughput and reducing packet losses in Slotted ALOHA multiple access systems. In the first part
of the section we recall some basics of finite field arithmetic, while in the second part we describe
NCDP at the transmitter and at the receiver side.

A. Basics of Finite Fields 

A finite field is a set with finitely many elements that is closed with respect to sum and multiplication.
Finite fields are often denoted as GF($s^n$), where $s$ is a prime number, $n$ is a positive integer and GF
stands for Galois Field. If $n = 1$ all operations (sum, subtraction, multiplication and division) in the field
coincide with the operations over the integers modulo $s$. If $n > 1$ the field is said to be an extended
Galois Field (EGF). In an EGF each element can be represented as a polynomial of degree lower than
$n$ with coefficients in GF($s$). An element in an EGF can be represented using the coefficients of the
corresponding polynomial representation. Thus, a string of $n$ bits can be interpreted as an element in
GF($2^n$). Along the same line, a string of $N = n \cdot L$ bits, $L \in \mathbb{N}$, can be represented as a vector in an
$L$-dimensional space over GF($2^n$) (see [17] for more details).

The sum operation in an EGF is done coefficient-wise. The sum of two elements in GF($2^n$) can
be calculated as the bit-wise XOR of the two $n$-bit strings corresponding to the two elements to add.

The product in an EGF can be calculated through polynomial multiplication modulo an irreducible 
polynomial which characterizes the field. Subtraction and division are defined as the inverse operations 
of sum and product, respectively, and calculated accordingly. 

Finally, let us consider a system of linear equations in GF($2^n$) with $N_{tx}$ variables and $S$ equations,
$S \geq N_{tx}$, with an associated $S \times N_{tx}$ coefficient matrix $\mathbf{A}$ having elements in GF($2^n$). The system
admits a unique solution iff the associated coefficient matrix $\mathbf{A}$ has exactly $N_{tx}$ linearly independent
columns (rows).
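The field operations recalled above can be sketched in a few lines of Python. The example works in GF($2^8$) and uses the irreducible polynomial $x^8 + x^4 + x^3 + x + 1$ (0x11B, the one adopted by AES). This polynomial is an assumption made for illustration, since the text does not specify which irreducible polynomial characterizes the field. The function names are hypothetical.

```python
def gf_add(a, b):
    """Sum in GF(2^n): bit-wise XOR of the two n-bit strings."""
    return a ^ b


def gf_mul(a, b, n=8, poly=0x11B):
    """Product in GF(2^n): polynomial multiplication modulo `poly`.

    `poly` encodes an irreducible polynomial of degree n, leading term
    included (0x11B is an assumed choice for n = 8).
    """
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> n:  # degree reached n: reduce modulo the polynomial
            a ^= poly
    return result


def gf_inv(a, n=8, poly=0x11B):
    """Multiplicative inverse, computed as a^(2^n - 2)."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^n)")
    result, base, exponent = 1, a, (1 << n) - 2
    while exponent:
        if exponent & 1:
            result = gf_mul(result, base, n, poly)
        base = gf_mul(base, base, n, poly)
        exponent >>= 1
    return result


if __name__ == "__main__":
    print(hex(gf_add(0x57, 0x83)), hex(gf_mul(0x57, 0x83)))
```

Subtraction coincides with the sum in characteristic 2, and division by $a$ is multiplication by `gf_inv(a)`.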

B. NCDP: Transmitter Side 

Assume that node $i$ has a message $\mathbf{u}_i$ to deliver to $R$ during frame $f$. We call active terminals the
nodes that have packets to transmit in a given frame. Each message is transmitted more than once within
a frame, i.e., several replicas of the same message are transmitted. We give details about the number
of replicas transmitted within a frame in the next section. Before each transmission, node $i$ pre-encodes
$\mathbf{u}_i$ as depicted in Fig. 1. The pre-coding process works as follows. $\mathbf{u}_i$ is divided into $L = K/n$ blocks of $n$
bits each. At each transmission a different coefficient $\alpha_{ij}$, $j \in \{1, \ldots, S\}$, is drawn randomly according
to a uniform distribution in GF($2^n$). If $\alpha_{ij} = 0$, terminal $T_i$ does not transmit in slot $j$. Each of the $L$
blocks $u_i^r$, $r \in \{1, \ldots, L\}$, is interpreted as an element in GF($2^n$) and multiplied by $\alpha_{ij}$. We call $\mathbf{u}'_{ij}$
the message after the multiplication by $\alpha_{ij}$. $\mathbf{u}'_{ij}$ is then channel encoded, generating the codeword
$\mathbf{x}_{ij} = C(\mathbf{u}'_{ij})$. After channel coding, a header $\mathbf{p}_i$ is added to $\mathbf{x}_{ij}$. Such a header is chosen within a set of
orthogonal codewords (e.g., Walsh-Hadamard). The same header $\mathbf{p}_i$ is used for all transmissions of node
$i$ within frame $f$, i.e., it does not change within a frame. Once the header is attached, $\mathbf{x}_{ij}$ is BPSK
modulated and transmitted.

Fig. 1. NCDP pre-encoding, channel coding and modulation scheme at the transmitter side. The message to be transmitted
is divided into sub-blocks. Each sub-block is multiplied by a coefficient $\alpha_{ij} \in$ GF($2^n$). Coefficients $\alpha_{ij}$, $j \in \{1, \ldots, S\}$, are
chosen at random in each time slot. After the multiplication, the message is channel-encoded, a header is attached and the
modulation takes place.

The choice of the coefficients and of the header is done as follows. Node $i$ draws a random number
$\mu$. $\mu$ is used to feed a pseudo-random number generator, which is the same for all terminals and is known
at $R$. The first $S$ outputs of the generator are used as coefficients. The header is uniquely determined
by $\mu$, i.e., there is a one-to-one correspondence between the set of values that can be assumed by $\mu$
and the set of available orthogonal headers. The orthogonality of the preambles allows the receiver
to know which of the active terminals in frame $f$ is transmitting in each time slot. Moreover, as the
header uniquely determines $\mu$ and thus the set of coefficients used by each node, $R$ is able to know
which coefficient is used by each transmitter in each slot. As we will see in Section IV-C, this
is of fundamental importance for the decoding process. As said before, the set of headers is a set of
orthogonal words, such as those usually adopted in CDMA. The fundamental difference with respect
to a CDMA system is that in such a system the orthogonality of the codes is used to orthogonalize the
channels and expand the spectrum, while in NCDP the orthogonality of the preamble is used only for
determining the identity of the transmitting node, which is obtained without any spectral expansion,
as the symbol rate $1/T_s$ is equal to the chip rate (i.e., the rate at which the modulated symbols are
transmitted over the channel).
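The transmitter-side steps described above (seeded coefficient generation and block-wise multiplication in GF($2^n$)) can be sketched as follows. This is a minimal illustration under several assumptions: Python's `random.Random` stands in for the shared pseudo-random number generator, the field is GF($2^8$) with the polynomial 0x11B, slots are indexed from 0, and channel coding, header insertion and modulation are omitted. All function names are hypothetical.

```python
import random


def gf_mul(a, b, n=8, poly=0x11B):
    """Product in GF(2^n) modulo an assumed irreducible polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> n:
            a ^= poly
    return result


def draw_coefficients(seed, num_slots, n=8):
    """First S outputs of a generator shared by all terminals and R.

    The seed plays the role of the random number drawn by the node;
    random.Random is only a stand-in for the actual generator.
    """
    rng = random.Random(seed)
    return [rng.randrange(1 << n) for _ in range(num_slots)]


def pre_encode(blocks, alpha, n=8, poly=0x11B):
    """Multiply each n-bit block of the message by alpha in GF(2^n)."""
    return [gf_mul(block, alpha, n, poly) for block in blocks]


def frame_transmissions(blocks, seed, num_slots):
    """Map slot index (from 0) to the pre-encoded message sent there.

    Slots whose coefficient is 0 are skipped: the node stays silent.
    """
    alphas = draw_coefficients(seed, num_slots)
    return {j: pre_encode(blocks, a) for j, a in enumerate(alphas) if a != 0}


if __name__ == "__main__":
    message_blocks = [0x12, 0xA7, 0x3C]  # L = 3 blocks of n = 8 bits
    print(frame_transmissions(message_blocks, seed=42, num_slots=4))
```

Since the receiver can rerun `draw_coefficients` with the seed identified through the header, it knows every coefficient without any side information being transmitted.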

C. NCDP: Receiver Side 

The decoding scheme at the receiver side is illustrated with an example in Fig. 2 and Fig. 3. In
the example, a frame with $S = 4$ slots and $N_{tx} = 3$ active terminals is considered. In each slot
the receiver uses the orthogonal preamble of each burst to determine which node is transmitting and
which coefficient has been used for that burst. As described in Section IV-B, the coefficients used by a
node in each burst are uniquely determined by the preamble. The preamble can be determined at $R$
using a bank of correlators which calculates in parallel the correlation of the received signal with each
element in the set of available preambles. The preamble is also used by $R$ to estimate the channel for
each of the transmitters. The details about the channel estimation are given in Section VI-A. Once the
channel has been estimated, the receiver tries to channel-decode the received signals using PHY NC,
i.e., to calculate the bitwise XOR of the transmitted messages, as detailed in Section III.
According to what is stated in Section IV-A and Section IV-B, the bitwise XOR is interpreted
as a sum in GF($2^n$). Thus the slots that have been correctly decoded are interpreted as a system of
equations in GF($2^n$) with coefficients $\alpha_{ij}$, which are known to the receiver through the headers (see
Fig. 2). At this point, if the coefficient matrix $\mathbf{A}$ has full rank, $R$ can recover all the original messages
using common matrix manipulation techniques in GF($2^n$) (see Fig. 3). If $\mathbf{A}$ is not full rank, not all
the transmitted packets can be recovered. However, a part of them can still be retrieved using matrix
manipulation techniques such as Gaussian elimination. The decoding process in case of a rank-deficient
coefficient matrix is analyzed in Section V.

March 3, 2013 DRAFT

Fig. 2. For each of the slots the receiver uses the
orthogonal preambles to determine which node is
transmitting. With the same preamble the channel from
each of the transmitters in the slot to $R$ is estimated.
The channel $h_{ij}$, $j \in \{1, \ldots, S\}$, changes at each slot due
to phase noise, according to the channel model described
in Section II. Once the channel has been estimated, the
decoder applies MU PHY NC to calculate the bitwise XOR
of transmitted messages. The bitwise XOR corresponds to
a linear equation in GF($2^n$) with coefficients $\alpha_{ij}$ which
are known to the receiver through the header. In the figure
only bursts with non-zero coefficients are shown.

Fig. 3. The receiver tries to channel-decode all of the
occupied slots, thus obtaining a system of equations in
GF($2^n$). At this point, if the matrix $\mathbf{A}$ of coefficients is
full rank, $R$ can obtain all the original messages. If $\mathbf{A}$ is
not invertible, $R$ can decode the "clean" bursts (i.e., the
bursts that did not experience collision), then subtract them
from the slots where their replicas are. The procedure goes
on until there are no more clean bursts. In the figure, $T$
represents the transpose operator.
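The recovery of the messages from the decoded slots can be sketched with Gauss-Jordan elimination over GF($2^8$). In the sketch below each row of `A` corresponds to one correctly decoded slot and each column to one active terminal (the orientation used in Section IV-A), while each row of `B` holds the $L$ field elements of the decoded XOR for that slot. The field polynomial 0x11B and all function names are assumptions made for illustration. The rank-deficient case is simply reported here rather than partially solved.

```python
def gf_mul(a, b, n=8, poly=0x11B):
    """Product in GF(2^n) modulo an assumed irreducible polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> n:
            a ^= poly
    return result


def gf_inv(a, n=8, poly=0x11B):
    """Multiplicative inverse as a^(2^n - 2); a must be non-zero."""
    result, base, exponent = 1, a, (1 << n) - 2
    while exponent:
        if exponent & 1:
            result = gf_mul(result, base, n, poly)
        base = gf_mul(base, base, n, poly)
        exponent >>= 1
    return result


def solve_gf(A, B, n=8, poly=0x11B):
    """Solve A U = B over GF(2^n) by Gauss-Jordan elimination.

    A: S x N_tx coefficients (one row per decoded slot).
    B: S rows, each the L field elements decoded in that slot.
    Returns the N_tx messages, or None if A has rank below N_tx.
    """
    rows = [list(a) + list(b) for a, b in zip(A, B)]
    n_unknowns = len(A[0])
    pivot_row = 0
    for col in range(n_unknowns):
        pivot = next((r for r in range(pivot_row, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None  # rank deficient: no unique solution
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        inv = gf_inv(rows[pivot_row][col], n, poly)
        rows[pivot_row] = [gf_mul(inv, x, n, poly) for x in rows[pivot_row]]
        for r in range(len(rows)):
            factor = rows[r][col]
            if r != pivot_row and factor:
                # Subtraction equals XOR in characteristic 2.
                rows[r] = [x ^ gf_mul(factor, y, n, poly)
                           for x, y in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return [rows[i][n_unknowns:] for i in range(n_unknowns)]


if __name__ == "__main__":
    A = [[1, 0, 7], [0, 2, 0], [3, 3, 0], [9, 0, 1]]  # S = 4, N_tx = 3
    U = [[1, 2], [3, 4], [5, 6]]                      # hypothetical messages
    B = []
    for row in A:
        acc = [0, 0]
        for a, u in zip(row, U):
            acc = [x ^ gf_mul(a, y) for x, y in zip(acc, u)]
        B.append(acc)
    print(solve_gf(A, B))
```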

V. Throughput and Energy Analysis 

During each frame users buffer packets to be transmitted in the following frame. Each node transmits
its packet more than once within a frame, randomly choosing a new coefficient in GF($2^n$) independently
at each transmission. As described in the previous section, the coefficients can be generated using
a pseudo-random number generator fed with a seed which is uniquely determined by the chosen
orthogonal preamble. Using the preamble the receiver can build up a coefficient matrix $\mathbf{A}$ for each
frame, with $A_{ij} = \alpha_{ij}$, $\alpha_{ij} \in \{0, \ldots, 2^n - 1\}$, such as the one represented in Table I. Columns
represent time slots while rows represent the active terminals, i.e., the terminals that transmit in the present
frame. If $\alpha_{ij} = 0$, terminal $i$ does not transmit in slot $j$. During time slot $j$, $R$ receives the sum of
the bursts with $\alpha_{ij} \neq 0$. From the received signal, $R$ tries to obtain the bit-wise XOR of the encoded
messages as described in Section III. The XOR is interpreted by $R$ as a linear equation in GF($2^n$), the
coefficients of which are derived through the orthogonal preamble as described in Section IV. If $N_{tx}$
is the number of active terminals in a frame and assuming that all the received signals are decoded
correctly, a linear system of equations in GF($2^n$) is obtained with $S$ equations and $N_{tx}$ variables. Each
variable corresponds to a different source message. If $\mathbf{A}$ has rank equal to $N_{tx}$, then all the messages
can be obtained by $R$. A necessary condition for $\mathbf{A}$ to be full rank is $N_{tx} \leq S$, i.e., the number of active
terminals in a frame must not exceed the number of slots in a frame. Assuming Poisson arrivals with
aggregate intensity $G$, the probability of such an event is:

$$\Pr\{N_{tx} \leq S\} = \sum_{m=0}^{S} \frac{(GS)^m e^{-GS}}{m!}, \qquad (6)$$

which also includes the case in which there are no active terminals during a frame. For instance, in
case of $S = 100$ slots and $G = 0.8$ the probability expressed by (6) is on the order of 0.99. Even if
$N_{tx} \leq S$, however, it can still happen that $\mathbf{A}$ is not full rank, i.e., not all the messages can be recovered.
The probability that $\mathbf{A}$ is full rank for a given $N_{tx} \leq S$ depends on the MAC policy, and particularly
on the probability distribution used to choose the coefficients.

TABLE I
Example of access pattern for three nodes transmitting in a frame with $S = 4$ slots per frame.
$\alpha_{ij} \in$ GF($2^n$) is the coefficient used by node $i$ in slot $j$. Each coefficient can assume one of $q = 2^n$ possible
values, including value 0, which corresponds to the case in which the terminal does not transmit.
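The numerical example for (6) can be checked with a short Python sketch. It evaluates the Poisson cumulative probability for a mean of $GS$ arrivals per frame, taking the sum up to and including $m = S$ (an assumption on the summation range). The function name `prob_at_most` is hypothetical.

```python
import math


def prob_at_most(num_slots, load):
    """Pr{N_tx <= S} for N_tx ~ Poisson(G * S), as in eq. (6)."""
    mean = load * num_slots
    term = math.exp(-mean)  # Poisson pmf at m = 0
    total = term
    for m in range(1, num_slots + 1):
        term *= mean / m    # pmf(m) from pmf(m - 1)
        total += term
    return total


if __name__ == "__main__":
    # S = 100 slots, G = 0.8: the result is on the order of 0.99.
    print(prob_at_most(100, 0.8))
```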

One possibility is to use a uniform distribution for the coefficients (i.e., each coefficient can assume
any value in $\{0, \ldots, 2^n - 1\}$ with probability $2^{-n}$). In this case the number $d$ of transmitted replicas is
a random variable, and the probability that $\mathbf{A}$ is full rank is [18]:

$$P(S, N_{tx}) = \prod_{k=0}^{N_{tx}-1} \left(1 - \frac{1}{2^{n(S-k)}}\right). \qquad (7)$$

Using (6) and (7) we find the expression for the normalized throughput:

$$\Phi = \frac{1}{S} \sum_{m=1}^{S} m \frac{(GS)^m e^{-GS}}{m!} P(S, m) = \frac{1}{S} \sum_{m=1}^{S} \frac{(GS)^m e^{-GS}}{(m-1)!} \prod_{k=0}^{m-1} \left(1 - \frac{1}{2^{n(S-k)}}\right). \qquad (8)$$
From Eqn. (8) we can see that $\Phi$ grows with $n$, which means that the system throughput increases with
the size of the considered finite field. Moreover, we have:

$$\lim_{n \to \infty} \Phi = \lim_{n \to \infty} G \sum_{m=0}^{S-1} \frac{(GS)^m e^{-GS}}{m!} \prod_{k=0}^{m} \left(1 - \frac{1}{2^{n(S-k)}}\right) = G \sum_{m=0}^{S-1} \frac{(GS)^m e^{-GS}}{m!}. \qquad (9)$$

From Eqn. (9) it can be seen that, as $n \to \infty$, the normalized throughput $\Phi$ tends to $G$ times the
probability of having fewer than $S$ transmitters in a frame.
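The dependence of the throughput on the field size can be explored numerically. The Python sketch below evaluates the full-rank probability, the normalized throughput and its limit for $n \to \infty$, under the assumptions that the throughput sum runs over $m = 1, \ldots, S$ and that the frame contains a Poisson number of active terminals with mean $GS$. The function names are hypothetical.

```python
import math


def full_rank_prob(num_slots, num_tx, n):
    """Probability that a random S x N_tx matrix over GF(2^n) has
    rank N_tx: product over k of (1 - 2^(-n (S - k)))."""
    prob = 1.0
    for k in range(num_tx):
        prob *= 1.0 - 2.0 ** (-n * (num_slots - k))
    return prob


def throughput(num_slots, load, n):
    """Normalized throughput: (1/S) * sum_m m * Pr{N_tx = m} * P(S, m)."""
    mean = load * num_slots
    pmf = math.exp(-mean)  # Poisson pmf at m = 0
    total = 0.0
    for m in range(1, num_slots + 1):
        pmf *= mean / m
        total += m * pmf * full_rank_prob(num_slots, m, n)
    return total / num_slots


def throughput_limit(num_slots, load):
    """Limit for n -> infinity: G * Pr{N_tx <= S - 1}."""
    mean = load * num_slots
    pmf = math.exp(-mean)
    cdf = pmf
    for m in range(1, num_slots):
        pmf *= mean / m
        cdf += pmf
    return load * cdf


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, throughput(100, 0.8, n))
    print("limit", throughput_limit(100, 0.8))
```

Most of the gain is obtained with small values of $n$, since the factors in the product approach 1 exponentially fast in $n$.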

The MAC scheme we just analyzed presents one main drawback in terms of the energy efficiency
of the protocol. Given the frame length $S$, a node transmits each message on average
$E[d] = S \times p$ times, $p = 1 - 2^{-n}$ being the probability of choosing a non-zero coefficient, i.e., the
average number of transmissions grows linearly with $S$. In order to decrease the energy consumption, the
probability of choosing the zero coefficient may be increased. However, a reduction in the transmission
probability $p$ may affect the system throughput. In order to understand the relationship between the
probability $p$ and the throughput $\Phi$, we refer to some results in random matrix theory. The problem
can be formulated as follows: consider an $N_{tx} \times S$ random matrix $\mathbf{A}$ over GF($2^n$) with i.i.d. entries,
each of which assumes value 0 with probability $1 - p$, while with probability $p$ it assumes a value in
$\{1, \ldots, 2^n - 1\}$. We are interested in the relationship between $p$ and the probability that $\mathbf{A}$ is full rank.
In [19] the authors show that, if we want to achieve a rank $N_{tx} - O(1)$ with high probability, then, for
$N_{tx}$ large, $p$ cannot be lower than $\log(N_{tx})/N_{tx}$. At high loads (i.e., $G \simeq 1$), on average $N_{tx} \simeq S$, which
means that, setting $p = \log(S)/S$, the average number of transmissions (and so the energy consumption)
for each node is $E[d] = \log(S)$, i.e., it grows logarithmically with the number of slots in a frame. On
the other hand, $S$ must be kept large enough, as this increases the decoding probability, which makes
the choice of a small $S$ impractical. With reference to the example considered earlier in this section, in
which $S = 100$, the average number of transmissions corresponding to the minimum required $p$ is equal
to about 4.6. We evaluated numerically the effect a reduction of $p$ has on $\Phi$ for the case $S = 100$ and
$q = 2^8$. We considered three cases. In the first one the transmission probability in each slot was
set to $p = 1 - 2^{-n} = 0.9961$, which corresponds to the case studied in the first part of this section and
for which the throughput is given by Eqn. (8). In the second case we set $p$ just above the threshold,
i.e., $p = 0.0625 > \log(S)/S = 0.0461$, while in the last case $p$ was set exactly equal to the threshold
probability. Fig. 4 shows the results together with the numerical validation of Eqn. (8). It is interesting
to note that passing from $p = 0.9961$ to $p = 0.0625$, with a reduction in transmission probability (or,
equivalently, in average energy per message) of about 93.7%, leaves the throughput unchanged, while
a further decrease of $p$ of just another 1.5% leads to a 10% reduction in the maximum throughput with
respect to the case $p = 0.9961$.
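The figures quoted in this example follow from elementary arithmetic, which the short Python sketch below reproduces. It assumes that the logarithm is the natural one (consistent with the value of about 4.6 for $S = 100$) and uses the intermediate transmission probability 0.0625 quoted in the text. The variable names are hypothetical.

```python
import math

num_slots = 100  # S
n = 8            # coefficients in GF(2^8)

# Uniform coefficients: a slot is used unless the zero element is drawn.
p_uniform = 1 - 2 ** (-n)

# Minimum transmission probability from the random-matrix result.
p_threshold = math.log(num_slots) / num_slots

# Average number of transmissions per message, E[d] = S * p.
mean_tx_uniform = num_slots * p_uniform
mean_tx_threshold = num_slots * p_threshold  # equals log(S), about 4.6

# Relative saving when moving from the uniform case to p = 0.0625.
p_intermediate = 0.0625
saving = 1 - p_intermediate / p_uniform      # about 0.937

if __name__ == "__main__":
    print(round(p_uniform, 4), round(p_threshold, 4))
    print(round(mean_tx_uniform, 2), round(mean_tx_threshold, 2))
    print(round(saving, 3))
```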
Fig. 4. Normalized throughput plotted against the normalized offered load for different values of the transmission probability
$p$. We set $S = 100$ slots per frame while the coefficients were chosen in GF($2^8$).



To further lower the energy consumption and control the number of repetitions $d$ (which, being
a binomial random variable, can theoretically assume values as large as $S$), an alternative is to fix
the number of transmitted replicas a priori. Although this solution may in some cases make it
impossible to decode all the transmitted messages, it may still be possible to recover many of them
by using Gaussian elimination.

VI. Implementation Aspects 
A. Channel Estimation and Node Identification 

For each frame the receiver $R$ needs to know which of the active terminals is transmitting in each
slot and must have channel state information for each of the users. Both needs are addressed by including
an orthogonal preamble, such as the spreading codes used in CDMA, at the beginning of the burst. The
use of an orthogonal preamble was proposed in [2] for the estimation of the phase in collided bursts.
In [2] the frequency offset and channel amplitude are derived from the clean bursts (i.e., bursts that did not
experience collisions) and assumed to remain constant over the whole frame. Unlike in [2], the method
we propose does not rely only on clean bursts. Thus the frequency offset and the amplitude of each
transmitter must be estimated using the collided bursts for each frame. Although the performance of
the estimator is likely to degrade with respect to the clean burst case, especially in case of high order
collisions, the estimation can leverage the information from all the collided bursts, which improves the
estimate. For instance, if a packet is transmitted twice during a given frame and experiences collisions
of order 2 in the first transmission and 4 in the second, the two estimates can be combined to obtain
a better estimate of amplitude and frequency offset, which are constant during the whole frame.

In order to prove the feasibility of channel estimation in such conditions we show the results
we obtained using the Estimate Maximize (EM) algorithm. We adopted the approach described in [20],
where the EM algorithm is used to estimate parameters from superimposed signals. In [20] two examples
were proposed, related to multipath delay estimation and direction of arrival estimation. We apply the
same approach to estimate amplitudes, phases and frequency offsets from the baseband samples of the
received signal in case of a collision of size $k$. The algorithm is divided into an E step, in which each
signal is estimated, and an M step, in which the mean square error between the estimate made at the
E step of the current iteration and the signal reconstructed using the parameters calculated in the previous iteration
is minimized with respect to the parameters to estimate. Formally, once the parameters have been initialized with
randomly chosen values, at each iteration we have the following two steps:

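Estimation step - for $i = 1, \ldots, k$ calculate

$$\hat{p}_i^{(n)}(t) = b_i(t) A_i^{(n)} e^{j(2\pi \Delta\nu_i^{(n)} T_s t + \varphi_i^{(n)})} + \beta_i \left[ r(t) - \sum_{l=1}^{k} b_l(t) A_l^{(n)} e^{j(2\pi \Delta\nu_l^{(n)} T_s t + \varphi_l^{(n)})} \right] \tag{10}$$

Maximization step - for $i = 1, \ldots, k$ calculate

$$\min_{A', \Delta\nu', \varphi'} \sum_{t=1}^{N_{pre}} \left| \hat{p}_i^{(n)}(t)\, b_i(t) - A' e^{j(2\pi \Delta\nu' T_s t + \varphi')} \right|^2 \tag{11}$$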

where $p_i(t)$ is the preamble of burst $i$ after the matched filter, $A'$, $\Delta\nu'$ and $\varphi'$ are tentative values for
the parameters to be estimated, $N_{pre}$ is the preamble length, $b_i(t) \in \{\pm 1\}$ is the $t$-th symbol in the
preamble of the $i$-th node and $T_s$ is the sampling period, taken equal to the symbol period. The $\beta_i$ are free
parameters that we arbitrarily set to $\beta_i = 0.8$, for $i = 1, \ldots, k$.
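
The two steps above can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the estimator used to produce the results in this paper: the function names (`walsh_rows`, `em_estimate`), the noise-free test signal, the zero-based time index, and the grid search over candidate frequency offsets used to solve the M step are all choices made here for brevity. For a fixed candidate frequency the best complex gain has a closed form, so the minimization in the M step reduces to picking the grid point with the largest correlation magnitude. Frequencies are expressed as the normalized product $\Delta\nu T_s$.

```python
import numpy as np

def walsh_rows(n, rows):
    """Selected rows of the n x n Sylvester-Hadamard matrix (n a power of two)."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h[rows]

def em_estimate(r, b, beta=0.8, n_iter=6, f_max=0.01, n_grid=201, seed=0):
    """EM-style estimation of amplitude, normalized frequency and phase.

    r : received preamble samples (complex, length N_pre)
    b : preamble symbols, shape (k, N_pre), entries +/-1
    The frequency grid (n_grid points in [0, f_max]) is an assumption of this sketch.
    """
    k, n = b.shape
    t = np.arange(n)
    rng = np.random.default_rng(seed)
    amp = rng.uniform(0.5, 1.5, k)             # random initialization
    freq = rng.uniform(0.0, f_max, k)
    phase = rng.uniform(-np.pi, np.pi, k)
    grid = np.linspace(0.0, f_max, n_grid)
    basis = np.exp(-2j * np.pi * np.outer(grid, t))   # one row per candidate frequency

    for _ in range(n_iter):
        # Signals reconstructed from the parameters of the previous iteration
        model = b * (amp[:, None]
                     * np.exp(1j * (2 * np.pi * freq[:, None] * t + phase[:, None])))
        # E step: each signal estimate gets a share beta of the common residual
        p_hat = model + beta * (r - model.sum(axis=0))
        # M step: least-squares fit of one complex exponential per signal
        for i in range(k):
            z = p_hat[i] * b[i]                # strip the known preamble symbols
            c = basis @ z / n                  # optimal complex gain per candidate
            j = np.argmax(np.abs(c))           # smallest residual <=> largest |c|
            freq[i], amp[i], phase[i] = grid[j], np.abs(c[j]), np.angle(c[j])
    return amp, freq, phase

if __name__ == "__main__":
    b = walsh_rows(128, [1, 2, 3])
    t = np.arange(128)
    true_amp = np.array([1.0, 0.8, 1.2])
    true_freq = np.array([0.002, 0.006, 0.009])
    true_phase = np.array([0.3, -1.0, 2.0])
    r = (b * (true_amp[:, None]
              * np.exp(1j * (2 * np.pi * true_freq[:, None] * t
                             + true_phase[:, None])))).sum(axis=0)
    print(em_estimate(r, b, n_iter=20))
```

The grid resolution bounds the accuracy of the frequency estimate in this sketch; a finer grid or a local refinement around the best grid point would be needed to approach the accuracy of a continuous minimization.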

We evaluated numerically the performance of the EM estimator assuming that phase offsets are
uniformly distributed in $[-\pi, +\pi]$, frequency offsets are uniformly distributed in $[0, \Delta\nu_{max}]$ with
$\Delta\nu_{max}$ equal to 1% of the symbol rate on the channel ($1/T_s$), and amplitudes are log-normally distributed.
Figures 5, 6 and 7 show the mean squared error (MSE) of the estimates of frequency, phase
and amplitude, respectively. Amplitude error is normalized to the actual amplitude value while phase
error is normalized to $\pi$. In the simulations we used as preambles Walsh-Hadamard words of length
128 symbols. The EM algorithm was run twice starting from randomly chosen initial values of the
parameters, taking as result the parameter values that lead to the minimum of the sum across
the signals of the error calculated in the last E step. This was done in order to reduce the probability
of choosing a "bad" local maximum, which is a problem that affects all "hill climbing" algorithms.
For each run 6 iterations were made.

Fig. 5. Mean squared error (MSE) of the frequency offset estimation, i.e., $E[|\Delta\nu - \hat{\Delta\nu}|^2]$. $E_s$ is the average energy per
transmitted symbol for each node. The modified Cramer-Rao lower bound (MCRLB) for the case of one transmitter is also
shown for comparison.

Fig. 6. MSE of the phase offset estimation normalized to $\pi$, i.e., $E[|\varphi - \hat{\varphi}|^2]/\pi^2$. $E_s$ is the average energy per transmitted
symbol for each node.

In Fig. 8 the FER curves for different collision sizes obtained using the LLR values calculated
in Section III are shown. The plots are obtained using a tail-biting duo-binary turbo code with rate
1/2 and codeword length equal to 1504 symbols. The phase offsets $\varphi_i$ are random variables uniformly
distributed in $[-\pi, +\pi]$ while frequency offsets are uniformly distributed in $[0, \Delta\nu_{max}]$ with $\Delta\nu_{max}$
equal to 1% of the symbol rate $1/T_s$. The FER curves for the case of estimated channels using the EM
algorithm are also shown.

Fig. 7. MSE of the amplitude estimation normalized to the actual amplitude of the channel, i.e., $E[|A - \hat{A}|^2/A^2]$. $E_s$ is the
average energy per transmitted symbol for each node.

Fig. 8. FER for the XOR of transmitted messages for different numbers of transmitters. $E_b$ is the energy per information bit
for each node. A tail-biting duo-binary turbo code with rate 1/2 and codeword length 1504 symbols is used by all nodes. Phase
offsets are uniformly distributed in $[-\pi, +\pi]$, frequency offsets are uniformly distributed in $[0, \Delta\nu_{max}]$ with $\Delta\nu_{max}$ equal
to 1% of the symbol rate on the channel. Amplitudes are constant and equal to 1. The FER curves for the case of estimated
channels using the EM algorithm are also shown.

B. Error Detection

An important issue in slotted ALOHA is the capability of the receiver to determine whether the
received bursts are correctly decoded or not. This is particularly important in NCDP, where an error
made in the decoding of a collision can propagate, possibly leading to the loss of a whole frame. A
common practice in packet networks is the use of a cyclic redundancy check (CRC), which allows a
wrong decoding to be detected with a certain probability. Some CRCs are based on a field, called the
CRC field, which is appended to the message before channel coding. As the CRC operations are done in
GF(2), and by the linearity of the channel encoder, the CRC field in the message obtained by decoding a
collision of size $k$ is a good CRC for $u_s$, which is the bitwise XOR of the messages encoded in the $k$
collided signals. This allows decoding errors to be detected, within the limits of the CRC capabilities,
also in collided bursts. The choice of the type of CRC to be used is out of the scope of this section.
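
The linearity argument can be checked with a small sketch. The CRC below (16 bits, generator 0x1021, zero initial register, no final XOR) and the helper names are assumptions chosen for illustration only; the paper does not prescribe a specific CRC. With this purely linear CRC, the XOR of several frames of equal length, each carrying its own CRC field, is itself a frame with a valid CRC field.

```python
def crc16(data: bytes, poly: int = 0x1021) -> int:
    """Bitwise CRC-16 with zero initial value and no final XOR (linear over GF(2))."""
    reg = 0
    for byte in data:
        reg ^= byte << 8
        for _ in range(8):
            reg = ((reg << 1) ^ poly) if reg & 0x8000 else (reg << 1)
            reg &= 0xFFFF
    return reg

def with_crc(msg: bytes) -> bytes:
    """Append the 2-byte CRC field to the message."""
    return msg + crc16(msg).to_bytes(2, "big")

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))

def crc_ok(frame: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the CRC field."""
    return crc16(frame[:-2]) == int.from_bytes(frame[-2:], "big")

# Three hypothetical messages of equal length, as in a collision of size k = 3
msgs = [b"node-1 payload!!", b"node-2 payload??", b"node-3 payload.."]
frames = [with_crc(m) for m in msgs]

u_s = frames[0]
for frame in frames[1:]:
    u_s = xor_bytes(u_s, frame)      # what the receiver obtains after decoding the collision

print(crc_ok(u_s))
```

A CRC with a non-zero initial value or a final XOR is affine rather than linear, so the check on the XOR of the frames would need a known correction term in that case.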

VII. Performance of Multi User Physical Layer Network Coding with 
Imperfect Symbol Synchronization 

In Section II we assumed that signals from different transmitters add up with symbol synchronism at the
receiver in case of a collision. In Fig. 9 an example is shown of the received signal and sampling instants in
the case of three nodes transmitting with no timing offsets. The transmitted signals, which are also shown,
modulate the sets of symbols [-1 1 -1], [-1 1 1] and [-1 -1 -1]. The situation depicted in the figure is an
illustrative one, as in a real system both I and Q signal components are present, signals may have different
amplitudes, phase and frequency offsets for each of the bursts, and the signal is immersed in thermal
noise. However, in a real system there will always be a certain symbol misalignment, which grows larger
as the resources dedicated to the synchronization phase diminish (see, e.g., [21] and references therein
for examples of synchronization algorithms). Being able to cope with imperfect symbol synchronism
can bring important advantages, such as less stringent constraints on signal alignment, with consequent
savings in terms of network resources needed for the synchronization. In this section we study the effect
of imperfect symbol synchronization and propose possible countermeasures.

Fig. 9. Received signal after the matched filter in case of three colliding bursts with no timing offsets, i.e., $\Delta T_1 = \Delta T_2 =
\Delta T_3 = 0$. The transmitted signals after the matched filter in case of collision-free reception are also shown. The transmitted
symbols are: [-1 1 -1], [-1 1 1] and [-1 -1 -1] for transmitter 1, 2 and 3, respectively. For the sake of clarity, frequency and phase
offsets as well as channel amplitudes were not included in the plot and the signals were considered as real. The samples, shown
with grey circles in the figure, are taken at instants corresponding to the optimal sampling instants for each of the signals as if
they were received without experiencing collision.

Let us consider a slotted multiple access with $k$ nodes accessing the channel at the same time. We
assume that each transmitter has its own phase and frequency offsets. We further assume that each burst
falls completely within the boundaries of a time slot, i.e., no burst can fall between two consecutive time
slots. Let $T'$ be the time at which the peak of the first symbol of the burst that arrives first is received
at R. We define the relative delay (RD) $\Delta T_i$ of node $i$ as the temporal distance between the peak value
of the first pulse of burst $i$ and $T'$. In other words, the burst which arrives first at the receiver is used
as reference, i.e., has RD equal to 0. We assume that square-root raised cosine (SRRC) pulses with roll-off
factor $\alpha$ are used. We further assume that all RDs belong to the interval $[0, \Delta T_{max}]$, with
$0 < \Delta T_{max} < T_s/2$.

In case of a collision of $k$ bursts, the received signal before the matched filter is:

$$y(t) = \sum_{i=1}^{k} s_i(t) + w(t), \tag{12}$$

where

$$s_i(t) = A_i \sum_{l=1}^{N} b_i(l)\, g(t - lT_s - \Delta T_i)\, e^{j(2\pi\Delta\nu_i t + \varphi_i)}, \tag{13}$$

$N$ being the number of symbols in the burst, $g(t)$ is the square-root raised cosine pulse and $w(t)$
represents an AWGN process. The samples taken after the matched filter at times $t_j$ are:

$$r(t_j) = y(t) \otimes g(-t)\,\big|_{t=t_j} = \sum_{i=1}^{k} q_i(t_j) + n(t_j), \tag{14}$$

where

$$q_i(t_j) = A_i \sum_{l=1}^{N} b_i(l)\, p(t_j - lT_s - \Delta T_i)\, e^{j(2\pi\Delta\nu_i t_j + \varphi_i)}, \tag{15}$$

$p(t)$ being the raised cosine pulse, $\otimes$ is the convolution operator and $n(t)$ is the noise process after
filtering and sampling. Note that in (15) the exponential term is treated as a constant. This approximation
is made under the assumption that $\Delta\nu T_s \ll 1$, i.e., the exponential term is almost constant over many
symbol periods.
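
The sample model after the matched filter can be illustrated with a short sketch. It is a simplified, noise-free version of the model above under assumptions made here for clarity: real-valued channels (no frequency or phase offsets, as in Fig. 9), a zero-based symbol index, a roll-off factor strictly greater than zero, and the hypothetical function names `raised_cosine` and `matched_filter_sample`.

```python
import numpy as np

def raised_cosine(t, alpha=0.35, ts=1.0):
    """Raised cosine pulse p(t) with roll-off alpha > 0 and symbol period ts."""
    x = np.asarray(t, dtype=float) / ts
    denom = 1.0 - (2.0 * alpha * x) ** 2
    singular = np.isclose(denom, 0.0)              # points t = +/- ts / (2 alpha)
    safe = np.where(singular, 1.0, denom)
    regular = np.sinc(x) * np.cos(np.pi * alpha * x) / safe
    limit = (np.pi / 4.0) * np.sinc(1.0 / (2.0 * alpha))
    return np.where(singular, limit, regular)

def matched_filter_sample(t, symbols, delays, amps=None, alpha=0.35, ts=1.0):
    """Noise-free sum over the k bursts of their contribution at time t.

    symbols : array of shape (k, N) with entries +/-1
    delays  : relative delays of the k bursts
    amps    : real channel amplitudes (default: all ones)
    """
    symbols = np.asarray(symbols, dtype=float)
    k, n = symbols.shape
    amps = np.ones(k) if amps is None else np.asarray(amps, dtype=float)
    idx = np.arange(n)                             # zero-based symbol index
    total = 0.0
    for i in range(k):
        pulses = raised_cosine(t - idx * ts - delays[i], alpha, ts)
        total += amps[i] * np.sum(symbols[i] * pulses)
    return float(total)

# Symbols of the three transmitters used in the figures
symbols = [[-1, 1, -1], [-1, 1, 1], [-1, -1, -1]]

# Perfect symbol synchronism: each sample is the plain sum of the symbols
print([matched_filter_sample(l, symbols, [0.0, 0.0, 0.0]) for l in range(3)])

# Relative delays 0, Ts/6 and Ts/4: every sample now also contains ISI
print([matched_filter_sample(l, symbols, [0.0, 1 / 6, 1 / 4]) for l in range(3)])
```

With zero delays the raised cosine pulse vanishes at all non-zero multiples of the symbol period, so no ISI appears; with non-zero relative delays at most one of the bursts can be sampled at its ISI-free instant.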

The sampled signal is then sent to the channel decoder. It is not clear at this point which is the
optimal sampling time, as the optimal sampling time for each of the bursts taken individually may be
different. Moreover, sampling the signal just once per symbol may not be the optimal choice. Actually,
as we will show in the following, the performance of the decoder is quite poor when a single sample
per symbol is taken.

In the following we propose several techniques to mitigate the impairment due to imperfect symbol
synchronization. We assume that R has knowledge of the relative delays of all the transmitters, which
can be derived through the orthogonal preambles. We further assume that R has perfect CSI for each
of the transmitters. Without loss of generality and for ease of exposition, from now on we will refer to
the sampling time for symbol number 1.

A. Single sample 

a) Mean Delay: The first method we present is Mean Delay (MD). In MD the received
signal is sampled just once per symbol. The sampling time is chosen to be the mean of the relative
delays, i.e.:

$$T_{MD} = \frac{1}{k} \sum_{m=1}^{k} \Delta T_m.$$

The sample $r(T_{MD})$ is then used to calculate the LLRs as in Eqn. (5). ISI is not taken into account.
B. Multiple samples 

In the following we describe four different methods that use $k$ samples per symbol, $k$ being the
collision size.

We start by describing two methods in which the symbol is sampled $k$ times in correspondence
of the RDs. Due to the imperfect synchronization, when the signal is sampled at $\Delta T_i$ the sample
obtained is the sum of the first symbol of each of the users, weighted by the relative channel coefficient,
plus an ISI term due to the signals $s_j$, $j \in \{1, \ldots, k\}$, $j \neq i$, which are sampled at instants that are not
ISI-free. As the LLRs need the channels of each of the users, the ISI should be taken into account. However,
the ISI is a function of many (theoretically all) symbols, and cannot be taken into account exactly. In
Fig. 10 the received signal after the matched filter is shown in the case of three colliding bursts with
timing offsets $\Delta T_1 = 0$, $\Delta T_2 = T_s/6$ and $\Delta T_3 = T_s/4$. The transmitted signals after the matched filter
in the case of collision-free reception are also shown. The symbols transmitted by each terminal are
the same as in Fig. 9. The samples, shown with grey circles in the figure, are taken in correspondence
of the RDs, which coincide with the optimal sampling instants for each of the signals as if they were
received without experiencing collision.

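b) Mean LLR: In Mean LLR (ML) the received signal is sampled $k$ times at the instants
corresponding to $\Delta T_i$, $i = 1, \ldots, k$. For each of the samples the LLRs are calculated as in (5). Then
the average of the $k$ LLRs is passed to the decoder:

$$\bar{L}^{\oplus}(l) = \frac{1}{k} \sum_{m=1}^{k} L^{\oplus}_m(l), \tag{16}$$

where $L^{\oplus}_m(l)$ denotes the LLR calculated from the sample taken at $\Delta T_m$.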
Fig. 10. Received signal after the matched filter in case of three colliding bursts with timing offsets $\Delta T_1 = 0$, $\Delta T_2 = T_s/6$
and $\Delta T_3 = T_s/4$. The transmitted signals after the matched filter in the case of collision-free reception are also shown. The
transmitted symbols are: [-1 1 -1], [-1 1 1] and [-1 -1 -1] for transmitter 1, 2 and 3, respectively. The samples, shown with grey
circles in the figure, are taken at instants corresponding to the optimal sampling instants for each of the signals as if they were
received without experiencing collision. Unlike in the case of perfect symbol alignment, here more than one sample per symbol
is taken.

c) Mean Sample: As in ML, also in Mean Sample (MS) $r(t)$ is sampled $k$ times in
correspondence of the relative delays. The difference between the two methods is that in MS the
samples are averaged to obtain the mean sample:

$$\bar{r} = \frac{1}{k} \sum_{m=1}^{k} r(\Delta T_m). \tag{17}$$

Finally, $\bar{r}$ is used in (5) instead of $r(t)$.

d) Uniform Sampling: In Uniform Sampling (US) the signal is sampled $k$ times as in the
previous methods, but the sampling times do not correspond to the RDs. The sampling times are
chosen uniformly in $[0, \Delta T_{max}]$, i.e., in case of $k$ transmitters the samples are taken at intervals of
$\Delta T_{max}/(k - 1)$. Then, as in MS, the samples are averaged and used in the calculation of the
LLRs. This method has the advantage that the receiver does not need to know the RDs in order
to decode, and the sampling itself is simplified as it is done uniformly in each symbol.
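
The sampling instants used by the methods described so far can be summarized in a few lines. The helper names below (`md_instant`, `rd_instants`, `us_instants`, `mean_sample`) are hypothetical, times are expressed in units of $T_s$, and `mean_sample` expects any callable that returns the matched filter output at a given time.

```python
import numpy as np

def md_instant(delays):
    """Mean Delay: a single sampling instant, the mean of the relative delays."""
    return float(np.mean(delays))

def rd_instants(delays):
    """Mean LLR / Mean Sample: one sampling instant per relative delay."""
    return np.sort(np.asarray(delays, dtype=float))

def us_instants(k, dt_max):
    """Uniform Sampling: k instants spaced by dt_max / (k - 1) in [0, dt_max]."""
    return np.linspace(0.0, dt_max, k)

def mean_sample(r, instants):
    """Average of the matched filter output r(t) over the given instants."""
    return sum(r(t) for t in instants) / len(instants)

delays = [0.0, 1 / 6, 1 / 4]          # relative delays of Fig. 10, in units of Ts
print(md_instant(delays))
print(rd_instants(delays))
print(us_instants(len(delays), 0.25))
```

Only `us_instants` is independent of the relative delays, which is the advantage of US noted above.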

e) Equivalent Channel: The received signal is sampled $k$ times at the instants
corresponding to $\Delta T_i$, $i = 1, \ldots, k$. In the Equivalent Channel (EC) method the amplitude variation of the
channel of each user due to imperfect timing is taken into account for the current symbol. Note that the
ISI is not taken into account; only the variation in amplitude of the present symbol due to imperfect
timing is accounted for. Assuming that the received signal is sampled at time $t = \Delta T_i$, the channel
coefficient of burst $q$ that is used in the LLR is:

$$h_q^{eq}(t) = A_q\, e^{j(2\pi\Delta\nu_q T_s \Delta T_i + \varphi_q)}\, p(\Delta T_i - \Delta T_q), \tag{18}$$

$p(t)$ being the raised cosine pulse. After the sampling, the $k$ samples per symbol are averaged together
and used in the LLR instead of $r(t)$. This sampling procedure is equivalent (apart from the ISI) to
filtering the received signal using a filter which is matched not to the single pulse, but to the pulse
resulting from the delayed sum of $k$ pulses.

In Fig. 11 the frame error rate is shown for the case of 5
transmitters with delays uniformly distributed in $[0, T_s/4]$. Constant channel amplitudes were considered,
while phases and frequency offsets are i.i.d. random variables in $[0, 2\pi]$ and $[0, \Delta\nu_{max}]$ respectively,
where $\Delta\nu_{max}$ is equal to $1/(100 T_s)$. The results for the 5 different methods are shown together with
the FER for the case of ideal symbol synchronism. The methods that use more than one sample per
symbol perform significantly better than MD, which uses only one sample per symbol. Among the
methods based on oversampling, MS and EC perform slightly better than the other two. The FER curves
of all methods present a lower slope w.r.t. the ideal case. The loss is about 1 dB at FER $= 10^{-2}$ for the
methods that use oversampling.
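
As an illustration of the amplitude term used by the EC method, the sketch below evaluates the raised cosine factor $p(\Delta T_i - \Delta T_q)$ for every pair of sampling instant and burst. The function names and the example delays are assumptions made here; the phase rotation of the channel coefficient is left out so that only the timing-induced attenuation is shown.

```python
import numpy as np

def raised_cosine(t, alpha=0.35, ts=1.0):
    """Raised cosine pulse p(t) with roll-off alpha > 0 and symbol period ts."""
    x = np.asarray(t, dtype=float) / ts
    denom = 1.0 - (2.0 * alpha * x) ** 2
    singular = np.isclose(denom, 0.0)
    safe = np.where(singular, 1.0, denom)
    regular = np.sinc(x) * np.cos(np.pi * alpha * x) / safe
    limit = (np.pi / 4.0) * np.sinc(1.0 / (2.0 * alpha))
    return np.where(singular, limit, regular)

def ec_gain_matrix(delays, alpha=0.35, ts=1.0):
    """Entry [i, q] is the amplitude factor of burst q when sampling at delay i."""
    d = np.asarray(delays, dtype=float)
    return raised_cosine(d[:, None] - d[None, :], alpha, ts)

gains = ec_gain_matrix([0.0, 1 / 6, 1 / 4])   # delays in units of Ts
print(np.round(gains, 3))
```

Each row corresponds to one sampling instant: the burst sampled at its own optimal instant keeps its full amplitude (diagonal entries equal to 1), while the other bursts are attenuated according to their timing distance from that instant.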




Fig. 11. Frame error rate for decoding a collision of size 5 with independent frequency and phase offsets across the transmitters
and delays uniformly distributed in $[0, T_s/4]$. A roll-off factor of $\alpha = 0.35$ was used. The results for the 5 different methods
are shown together with the FER for the case of ideal symbol synchronism. Oversampling significantly improves the FER with
respect to the case of single sample. The two methods that exploit knowledge of relative delays, i.e., MS and EC, perform slightly
better than the others. The FER curves of all methods present a lower slope w.r.t. the ideal case, losing about 1 dB at FER $= 10^{-2}$
for the methods that use more than one sample.

VIII. Numerical Results

In this section we present the numerical results. Our performance metrics are the normalized
throughput $\Phi$, defined as:

$$\Phi = G(1 - \Upsilon), \tag{19}$$

where $\Upsilon \in [0, 1]$ is the average packet loss rate (i.e., the ratio of the number of lost packets to the total
number of packets that arrive at the transmitters), and the average energy consumption per received
message $\eta$, defined as the average number of transmissions needed for a message to be correctly
received by R. We consider two benchmarks. The first one is a system that implements the contention
resolution diversity slotted ALOHA (CRDSA) protocol, which has been proposed in [2]. In CRDSA a
node transmits two or more copies of a burst (twin bursts) in different slots randomly chosen within a
frame. Each of the twin bursts contains information about the position of the other twin bursts in the
frame. If one of the twin bursts does not experience a collision (i.e., it is clean) and can be correctly
decoded, the position of the other twin bursts is known. These bursts may or may not experience a
collision with other bursts. If they do, their interference is removed through interference cancelation
using the decoded bursts. In order to do this, R stores the whole frame, decodes the clean bursts and
reconstructs the modulated signals; once the effect of each user's channel has been included in the
reconstruction, the reconstructed signals are subtracted from the slots in which their replicas are located.
The IC process is iterated $N_{iter}$ times, each time decoding the bursts that appear to be "clean" after the
previous IC iteration. The second benchmark is a slotted ALOHA system.
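
The iterative IC process of CRDSA described above can be sketched at the slot level. The model below is deliberately idealized and is an assumption of this example: a burst is decodable if and only if it is alone in its slot, and cancelation of its replicas is perfect. The function name `crdsa_decode` and the input format are hypothetical.

```python
def crdsa_decode(replica_slots, n_iter=10):
    """Slot-level CRDSA interference cancelation on an ideal collision channel.

    replica_slots : dict mapping each user to the slots holding its replicas
    n_iter        : maximum number of IC iterations
    Returns the set of users whose burst is recovered.
    """
    # Build the frame: for every slot, the set of users transmitting in it
    slots = {}
    for user, positions in replica_slots.items():
        for s in positions:
            slots.setdefault(s, set()).add(user)

    decoded = set()
    for _ in range(n_iter):
        # Bursts that are alone in a slot are "clean" and can be decoded
        clean = {next(iter(users)) for users in slots.values() if len(users) == 1}
        if not clean:
            break
        decoded |= clean
        # Cancel every replica of the decoded bursts from the stored frame
        for users in slots.values():
            users -= clean
    return decoded

# A, B and C are recovered in two iterations; X and Y collide in both replicas
print(crdsa_decode({"A": (0, 1), "B": (1, 2), "C": (2, 3)}))
print(crdsa_decode({"X": (0, 1), "Y": (0, 1)}))
```

The second call shows the case in which IC cannot start: when no clean burst exists, none of the collided bursts can be recovered by cancelation alone.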

We consider two different setups. In the first, the nodes do not receive any feedback from the receiver,
while in the second setup R gives some feedback to the active terminals. For this last case we consider
an automatic repeat request (ARQ) scheme, in which a node receives an acknowledgement (ACK)
or a negative acknowledgement (NACK) from the receiver in case a message is or is not correctly
received, respectively. An alternative to the NACK is to have the transmitters use a timer for
each transmitted packet, indicating the time elapsed since it was transmitted. If the timer exceeds
a threshold value (which depends on the system's RTT), the message is declared to be lost. A node
that receives a NACK (or whose timer exceeds the threshold value) enters a backlog state. Backlogged
nodes retransmit the message for which they received the NACK in another frame, uniformly chosen at
random among the next $B$ frames. We call $B$ the maximum backlog time. The process goes on until the
message is acknowledged [22]. In both setups we assume a very large population of users. Furthermore,
we assume that the average SNR is high enough so that the FER at the receiver is negligible.

In the first setup, in which no feedback is provided by the receiver, the average amount of energy
spent by a node for each message which is correctly received does not change with the system load
$G$, and is equal to the average number of times a message is repeated within a frame. In Fig. 12 the
normalized throughput $\Phi$ is plotted against the normalized traffic load $G$. The normalized traffic load is
the average rate at which new messages (i.e., messages which are being transmitted for the first time)
are injected in the network, and is independent of the number of times a message is repeated within
a frame. In the figure, the throughput curves of the NCDP and CRDSA schemes for $d = 2$ and $d = 3$
replicas are shown. The throughput curve for NCDP in case of a constant retransmission probability
$p = 0.0453$ is also shown. Note that this probability is above the threshold value we mentioned in
Section V, as for $S = 150$ we have $\log(S)/S = 0.0334$. The scheme with $p = 0.0453$ outperforms
all the others in terms of throughput, achieving a peak value of about 0.8. It is interesting to note
how increasing the number of transmissions per message (and so the energy consumption) leads to an
increase in the peak throughput of the system. However, $\Phi$ increases by about 0.2 when passing from $d = 2$
to $d = 3$ repetitions, while the increase in the peak throughput is only about 0.05 when passing from
$d = 2$ repetitions per message to an average of $E[d] = 6.795$ in case of a fixed transmission probability.

Fig. 12. Normalized throughput $\Phi$ vs normalized traffic load $G$. The normalized traffic load is the average rate at which new
messages are injected in the network, and is independent of the number of times a message is repeated within a frame. In the
simulation the frame size was set to $S = 150$ slots. No feedback was assumed from the receiver.
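
The two figures quoted above can be reproduced with a short check. It assumes that the logarithm in the threshold is the natural logarithm (which matches the quoted value 0.0334) and that a node transmits independently in each of the $S$ slots of a frame with probability $p$, so that the average number of transmissions per frame is $S p$.

```python
import math

S = 150          # frame size in slots
p = 0.0453       # per-slot transmission probability

threshold = math.log(S) / S      # natural logarithm assumed
mean_transmissions = S * p       # average number of transmissions per frame

print(round(threshold, 4), round(mean_transmissions, 3))
print(p > threshold)
```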

In the second setup, in which retransmissions are allowed, we evaluate jointly the spectral efficiency
(average number of messages successfully received per slot) and the energy consumption (average
number of transmissions needed for a message to be correctly received) of the schemes under study. In
Fig. 13 $\Phi$ is plotted against $G$ for a frame size $S = 150$ slots and a maximum backlog time $B = 50$
frames. The figure shows how $\Phi$ increases linearly with $G$ up to a threshold load value. Such threshold
increases with the (average) number of repetitions of the considered scheme. The $\Phi$ curve of NCDP
upper-bounds that of CRDSA. The reason for this lies in the way the decoding process is carried out
by the receiver R in NCDP. R first tries to decode the whole frame, which is feasible if the coefficient
matrix $A$ has rank $N_{tx}$. If the whole frame cannot be decoded, then R applies Gaussian elimination on
$A$, in order to recover as many messages as possible. It can be easily verified that Gaussian elimination
in NCDP is the equivalent, in a finite field, of the IC process of CRDSA, which is applied in the analog
domain.

Fig. 13. Normalized throughput $\Phi$ vs normalized traffic load $G$ in a system with retransmission. In the simulation the frame
size was set to $S = 150$ slots while the maximum backlog time was set to $B = 50$ frames.

In order to compare jointly the spectral and the energy efficiency of the different schemes,
we plot the curves for the normalized throughput vs the average energy consumption per received
message $\eta$, which are shown in Fig. 14. The increase in throughput coming from an increased number of
transmissions implies a higher energy consumption for a given transmitter in a given frame. However,
this does not necessarily imply a loss in energy efficiency. As a matter of fact, the simulation results
we are going to present show that there is not a scheme that outperforms the others in terms of both
energy and spectral efficiency; which scheme is best depends on the maximum throughput we want
to achieve. In Fig. 14 we see that SA achieves a higher throughput with a lower energy consumption
with respect to the other schemes in the region $\Phi < 0.35$. In the region $\Phi > 0.35$, instead, both NCDP
and CRDSA achieve a higher throughput with lower energy consumption with respect to SA. NCDP and
CRDSA behave almost in the same way in the case of 2 repetitions, achieving a maximum throughput
of 0.5 for an average energy consumption of 2. In the case of 3 repetitions NCDP achieves a maximum
$\Phi$ of 0.7, higher than CRDSA, for which the peak value is 0.6, for $\eta = 3$. In the NCDP scheme with
a retransmission probability of $p = 0.0453$ a peak throughput of 0.8 is achieved in correspondence of
an average energy consumption of $\eta = 6.795$. For comparison, we also show the throughput-energy
curve for NCDP in case of $p = 0.9961$, i.e., coefficients $\alpha$ are chosen uniformly in GF($2^8$). The high
$p$ leads to a high throughput, but also to a high energy consumption, with a minimum of $\eta = 149.415$.
Moreover, we note that the gain with respect to the scheme with $p = 0.0453$ is negligible (about 5%),
especially when compared to the energy saving of about 95% of this last one.

Fig. 14. Normalized throughput vs average energy consumption per decoded message for $S = 150$ and $B = 50$ frames.
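
The decoding procedure at R described above (full-rank check, followed by Gaussian elimination to recover as many messages as possible) can be illustrated as follows. For simplicity this sketch works over GF(2), which is an assumption of the example: the scheme in this paper uses coefficients from an extended Galois field. The function names and the small coefficient matrices are hypothetical. After reduction to row echelon form, every row with a single non-zero entry identifies a message that can be recovered even if the matrix is rank deficient.

```python
import numpy as np

def rref_gf2(a):
    """Reduced row echelon form over GF(2). Returns (reduced matrix, rank)."""
    a = np.array(a, dtype=np.uint8) % 2
    rows, cols = a.shape
    pivot_row = 0
    for col in range(cols):
        if pivot_row == rows:
            break
        candidates = np.nonzero(a[pivot_row:, col])[0]
        if candidates.size == 0:
            continue                               # no pivot in this column
        swap = pivot_row + candidates[0]
        a[[pivot_row, swap]] = a[[swap, pivot_row]]
        for r in range(rows):
            if r != pivot_row and a[r, col]:
                a[r] ^= a[pivot_row]               # row addition in GF(2) is XOR
        pivot_row += 1
    return a, pivot_row

def recoverable_messages(a):
    """Indices of the messages isolated by the elimination, and the matrix rank."""
    reduced, rank = rref_gf2(a)
    solved = sorted(int(np.argmax(row)) for row in reduced if row.sum() == 1)
    return solved, rank

# Rows are slots, columns are transmitters; entry 1 means "transmitted in this slot"
print(recoverable_messages([[1, 1], [0, 1]]))          # full rank
print(recoverable_messages([[1, 0, 0], [1, 1, 0]]))    # rank deficient, partial recovery
print(recoverable_messages([[1, 1, 0], [0, 1, 1]]))    # rank deficient, nothing isolated
```

In the second example the first slot is clean and its message is subtracted from the second slot, which mirrors one iteration of interference cancelation in CRDSA.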

IX. Conclusions 

We have proposed a new collision recovery scheme for symbol-synchronous slotted ALOHA
systems based on PHY layer NC over extended Galois Fields. This allows the diversity of the system
to be better exploited, leading to increased spectral efficiency and, depending on the system load, to an
increased energy efficiency. We have compared the proposed scheme with two benchmark schemes in two
different setups. One is a best-effort setup, in which the nodes do not receive any feedback from the
receiver. In the second setup feedback is allowed from the receiver and an ARQ mechanism is assumed.
In the second setup we have evaluated jointly the spectral efficiency and the energy consumption of the
proposed scheme and compared it with other collision resolution schemes previously proposed in the
literature. Once the PHY layer NC is applied to decode the collided bursts, the receiver applies common
matrix manipulation techniques over finite fields, which results in a high-throughput scheme. The increase
in throughput coming from an increased number of transmissions implies a higher energy consumption
for a given transmitter in a given frame. However, this does not necessarily imply a loss in energy
efficiency. We showed that NCDP achieves a higher spectral efficiency with respect to the considered
benchmarks, while there is not a single scheme that outperforms the others in terms of both energy and
spectral efficiency: the best scheme depends on the maximum achievable throughput.

Furthermore, we carried out an analysis of several physical layer issues related to multi-user PHY
NC. We extended the analysis of, and proposed countermeasures against, the effects of physical layer
impairments on the FER when applying PHY NC for a generic number of colliding signals. In particular,
we took into account frequency and phase offsets at the transmitters which, to the best of our knowledge,
have been previously addressed only for the case of two colliding signals. Finally, we showed the
feasibility of channel estimation for PHY NC in the presence of more than two colliding signals and
studied the effect of imperfect symbol synchronism on the decoder FER, proposing four different methods
to compensate for such effect. To the best of our knowledge, this kind of analysis has been carried out
only for the case of two colliding signals and mainly in the context of two-way relay communication.

Appendix 

Starting from the samples $r(t_l)$, the receiver R wants to decode the codeword $x_s = x_1 \oplus x_2 \oplus \ldots \oplus x_k$,
where $\oplus$ denotes the bit-wise XOR. In order to do this we must feed the decoder of R with the vector
$L^{\oplus} = \{L^{\oplus}(1), \ldots, L^{\oplus}(N)\}$ of LLRs for $x_s$. We have:

$$L^{\oplus}(l) \triangleq \ln \frac{\Pr[x_s(l) = 1 \mid r(t_l)]}{\Pr[x_s(l) = 0 \mid r(t_l)]} = \ln \frac{\Pr[r(t_l) \mid x_s(l) = 1]}{\Pr[r(t_l) \mid x_s(l) = 0]}. \tag{20}$$

The last equality follows from the symmetry of the XOR operator provided that the $x_j(l)$'s are independent
and identically distributed (i.i.d.) with $\Pr[x_j(l) = 1] = \Pr[x_j(l) = 0] = 1/2$. Equation (20) reduces to
the calculation of the ratio of the likelihood functions of $r(t_l)$ for the cases $x_s(l) = 1$ and $x_s(l) = 0$.
We indicate these functions as $f_1(r(t_l))$ and $f_0(r(t_l))$ respectively. Functions $f_0(r(t_l))$ and $f_1(r(t_l))$
are Gaussian mixtures:

$$f_1(r(t_l)) = \frac{2^{-(k-1)}}{\sqrt{2\pi N_0}} \sum_{i=1}^{\lfloor (k+1)/2 \rfloor} \sum_{m=1}^{\binom{k}{2i-1}} e^{-\frac{\left| r(t_l) - d^{o}(2i-1,m)^T h(t_l) \right|^2}{2N_0}}, \tag{21}$$

$h(t_l)$ being a column vector containing the channel coefficients of the $k$ transmitters at time $t_l$ (which
change at each sample due to frequency offsets), while $d^{o}(2i-1, m)$ is a column vector containing one
(the $m$-th) of the $\binom{k}{2i-1}$ possible permutations over $k$ symbols (without repetitions) in which an odd
number ($2i-1$) of symbols correspond to bit value 1. As for the case with $x_s(l) = 0$ we have:

$$f_0(r(t_l)) = \frac{2^{-(k-1)}}{\sqrt{2\pi N_0}} \sum_{i=0}^{\lfloor k/2 \rfloor} \sum_{m=1}^{\binom{k}{2i}} e^{-\frac{\left| r(t_l) - d^{e}(2i,m)^T h(t_l) \right|^2}{2N_0}}, \tag{22}$$

where $d^{e}(2i, m)$ is a column vector containing one (the $m$-th) of the $\binom{k}{2i}$ possible permutations over $k$
symbols (without repetitions) in which an even number ($2i$) of symbols correspond to bit value 1. Finally,
using (21) and (22) in (20) we find the following expression for the LLR:

$$L^{\oplus}(l) = \ln \frac{\sum_{i=1}^{\lfloor (k+1)/2 \rfloor} \sum_{m=1}^{\binom{k}{2i-1}} e^{-\frac{\left| r(t_l) - d^{o}(2i-1,m)^T h(t_l) \right|^2}{2N_0}}}{\sum_{i=0}^{\lfloor k/2 \rfloor} \sum_{m=1}^{\binom{k}{2i}} e^{-\frac{\left| r(t_l) - d^{e}(2i,m)^T h(t_l) \right|^2}{2N_0}}}. \tag{23}$$
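
The LLR of the XOR can be evaluated numerically by enumerating all $2^k$ bit combinations and grouping them by parity, as in the sketch below. Several details are assumptions of this example and are not taken from the paper: the BPSK mapping (bit 0 to +1, bit 1 to -1), the parameter `noise_var` standing for the noise term in the exponent of the Gaussian mixtures, and the function name `xor_llr`. The log-sum-exp form is used only for numerical stability.

```python
import itertools
import numpy as np

def xor_llr(r, h, noise_var):
    """LLR of the XOR of k bits given one received sample r.

    r         : received sample (real or complex)
    h         : channel coefficients of the k transmitters at this sample
    noise_var : noise variance used in the Gaussian exponents (assumed normalization)
    """
    h = np.asarray(h, dtype=complex)
    log_terms = {0: [], 1: []}
    for bits in itertools.product((0, 1), repeat=len(h)):
        d = 1.0 - 2.0 * np.array(bits)             # assumed mapping: 0 -> +1, 1 -> -1
        metric = -abs(r - d @ h) ** 2 / (2.0 * noise_var)
        log_terms[sum(bits) % 2].append(metric)    # parity of the combination = XOR
    odd = np.logaddexp.reduce(log_terms[1])        # log of the sum for XOR = 1
    even = np.logaddexp.reduce(log_terms[0])       # log of the sum for XOR = 0
    return float(odd - even)

# Two transmitters with unit channels: r = 0 can only come from opposite symbols
print(xor_llr(0.0, [1.0, 1.0], 0.1) > 0)
# r = 2 is most likely produced by two equal symbols, i.e. XOR = 0
print(xor_llr(2.0, [1.0, 1.0], 0.1) < 0)
```

The number of terms grows as $2^k$, so this direct enumeration is practical only for small collision sizes.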